Entry Name:  "UBA-Barretto-MC1"

VAST Challenge 2015
Mini-Challenge 1

 

 

Team Members:

Alfredo Barretto, University of Buenos Aires, acbarrettomdp@gmail.com PRIMARY

Nelson Amaya, University of Buenos Aires, nelmaya@gmail.com

Juan Orlowski, University of Buenos Aires, orlowski@agro.uba.ar

Student Team:  YES

 

Did you use data from both mini-challenges?  NO

 

Analytic Tools Used:

Tableau

Excel

SQL Server

InfoStat

SPSS

 

Approximately how many hours were spent working on this submission in total?

60

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2015 is complete? YES

 

 

Video:

index_archivos\mc1_video.wmv

 

 

Questions

MC1.1 – Characterize the attendance at DinoFun World on this weekend. Describe up to twelve different types of groups at the park on this weekend.

a. How big is this type of group?

b. Where does this type of group like to go in the park?

c. How common is this type of group?

d. What are your other observations about this type of group?

e. What can you infer about this type of group?

f. If you were to make one improvement to the park to better meet this group’s needs, what would it be?

We selected 8 big groups to describe the most popular patterns of attendance at the attractions. We also selected 4 small groups to describe uncommon patterns that could be related to the crime.

Among the big groups we have:

Clusters 17, 11 and 2 are composed of 668, 458 and 557 IDs, respectively. They mainly showed a preference for Thrill Rides (TR) and avoided Kiddie Land (KL) games (Figure 1). These clusters account for 5.9%, 4.0% and 4.9% of the IDs that attended this weekend. Cluster 11 also showed a main preference for the Shows and Entertainment (S&E) attractions of the Coaster Alley (CA) zone and secondary preferences for Rides for Everyone (RfE) games and food store 31, while cluster 2 also showed a main preference for S&E attraction 64 and secondary preferences for RfE games. It can be inferred that most of these IDs are people who like adventure and risk and who attended without children. The improvements we suggest are 1) to place a restroom near game 3 (TR), because the nearest restroom is about 350 meters from this game, and 2) to set up beer gardens between attractions 48 and 31, 81 and 4, and 32 and 57, because the other beer places are far from these zones.

Clusters 5 and 24 are composed of 465 and 554 IDs, respectively. Both clusters showed a main preference for TR games; cluster 5 also preferred all S&E games, while cluster 24 preferred only S&E 64 and 32 (Figure 1). These clusters account for 4.1% and 4.9% of the IDs that attended this weekend. Both clusters showed a regular, intermediate preference for the other games. It can be inferred that some of these IDs attended with children and tried to visit all the attractions of the park. We suggest placing an S&E attraction between TR games 5 and 3, as they are far from any S&E attraction.

Clusters 10, 20 and 15 are composed of 440, 589 and 459 IDs, respectively. They showed clear preferences for some, but not all, TR games and for other attractions. Cluster 10 showed preferences for TR games 1, 6 and 4 and for S&E attraction 64, while cluster 20 mainly liked to go to TR game 7, to the Wet Land (WL) TR and to TR game 5 of Tundra Land (TL). Similarly, cluster 15 preferred TR games 8, 7, 1 and 3, S&E attraction 32 and Kiddie Ride (KR) game 11. These clusters account for 2.3%, 5.2% and 4.0% of the IDs that attended this weekend. Other observations about these clusters are that 1) cluster 10 showed an intermediate preference for games 12, 15, 18, 8, 32, 63, 23, 81, 3, 5 and 26, 2) cluster 20 showed an intermediate preference for CA games and a minor preference for KL and TL games, and 3) cluster 15 showed a minor preference for games 12, 6, 30, 31 and 62 and an intermediate preference for the rest of the games. It can be inferred that the people in these clusters attended with children who, in the case of cluster 15, mainly preferred game 11. We suggest placing another restroom and food store between games 4 and 6, because there are few of them in this area.

Among the small groups we have (no improvements are given, as we considered it unjustifiable to change the park only for minor groups):

Cluster 3 was composed of 33 IDs. It is characterized by an equal preference for all the games its members attended. They showed a preference for KR games 11, 14 and 17, for all S&E games, for RfE games 23, 22, 25 and 26, for TR games 81, 3 and 5, and for food store 31. It accounts for 0.3% of the IDs that attended this weekend. This group showed no preference for restroom 49 or for information store 62. It can be inferred that this is a group of people that followed the same tour, as their check-in frequencies are identical.

Cluster 9 was composed of 80 IDs. They showed a preference for KR games 9, 10, 11, 13, 14 and 15, for TR games 2, 6, 7 and 5, and for the RfE games of the TL zone, mainly games 5, 25 and 26. It accounts for 0.7% of the IDs that attended this weekend. This group showed no preference for KR games 12, 17, 18 and 19, for S&E attractions 64 and 32, for RfE games 24, 30, 22 and 27, for TR games 4 and 81, for restroom 49 or for information store 62. This group of people came to the park with children and, as there are several games with equal frequency, it is composed of a group of people that followed the same tour.

Cluster 19 was composed of 56 IDs. They showed a preference for several games (KR 17, S&E 64, TR 2, 6, 8, 3 and 5, and RfE 23, 26 and 27) and for all zones. It accounts for 0.5% of the IDs that attended this weekend. This group showed no preference for the S&E attractions of the CA zone or for information store 62. As there are several games (7) with equal frequency, it can be inferred that this cluster also contains a group of people that followed the same tour.

Cluster 13 was composed of 79 IDs. They showed a preference for KR games 16, 17 and 18, for TR games 8, 81 and 5, and for RfE games 21, 22 and 27. It accounts for 0.7% of the IDs that attended this weekend. This group showed no preference for KR games 9, 10, 11 and 19, for TR games 1 and 2, for RfE games 29, 20, 23, 25, 26 and 28, for S&E attractions 32 and 63, for food store 31, for restroom 49 or for information store 62. This group of people came to the park with children and, as there are several games (5) with equal frequency, it is composed of a group of people that followed the same tour.
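The "same tour" inference for the small clusters could be probed with a stricter check than equal frequencies: grouping IDs whose ordered check-in sequences are identical. The sketch below is only an illustration of that idea, not the method used for this entry; the tuple layout `(visitor_id, timestamp, attraction)`, the function name and the sample rows are all assumptions.

```python
from collections import defaultdict


def same_tour_groups(checkins, min_size=2):
    """Group visitor IDs whose check-in sequences are identical.

    checkins: iterable of (visitor_id, timestamp, attraction) tuples
    (hypothetical layout). Returns a sorted list of sorted ID lists.
    """
    per_id = defaultdict(list)
    for visitor_id, timestamp, attraction in checkins:
        per_id[visitor_id].append((timestamp, attraction))

    by_sequence = defaultdict(list)
    for visitor_id, visits in per_id.items():
        # The time-ordered (timestamp, attraction) sequence is the grouping key.
        by_sequence[tuple(sorted(visits))].append(visitor_id)

    return sorted(sorted(ids) for ids in by_sequence.values() if len(ids) >= min_size)


# Made-up sample: IDs 101 and 102 share a tour, 103 does not.
sample = [
    (101, "09:00", 11), (101, "10:30", 64),
    (102, "09:00", 11), (102, "10:30", 64),
    (103, "09:05", 11), (103, "11:00", 5),
]
print(same_tour_groups(sample))
```

Requiring identical timestamps is deliberately strict; a looser variant could compare only the ordered list of attractions.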

 

Figure 1. Selected clusters and the relative attendance of their members at different games. Clusters 2, 5, 10, 11, 15, 17, 20 and 24 are big clusters, while clusters 3, 9, 19 and 13 are small.

 

 

MC1.2 – Are there notable differences in the patterns of activity in the park across the three days?  Please describe the notable differences you see.

 

 

 

 

One notable difference is that the number of people who checked in at the park increased from Friday to Sunday. On Friday, 47,033 different IDs were detected checking in, while on Saturday and Sunday this number increased to 73,767 and 84,183, respectively. This increase is seen in all games except S&E attractions 32 and 63 (Figure 2A).

Additionally, the relative preference of IDs for the games was similar on Friday and Saturday but differed on Sunday (Figure 2B). On Sunday, the relative preferences for S&E attractions 32 and 63 of the CA zone were lower than on the other days, as was the case for TR game 4 of WL. In contrast, the relative preferences for TR games 1, 2, 6 and 8 of the CA zone and TR game 63 of the TL zone were higher than on the other days.
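The absolute and relative attendance figures compared above can be tabulated in a few lines. This is a minimal sketch under stated assumptions: absolute attendance is read as the number of distinct IDs checking in at an attraction on a day, relative attendance as that count divided by the day's sum over all attractions, and the `(day, visitor_id, attraction)` layout and sample rows are made up.

```python
from collections import defaultdict


def attendance_by_day(checkins):
    """Return {day: {attraction: (distinct_ids, share_of_day)}}."""
    ids = defaultdict(set)  # (day, attraction) -> set of visitor IDs
    for day, visitor_id, attraction in checkins:
        ids[(day, attraction)].add(visitor_id)

    totals = defaultdict(int)  # day -> sum of distinct-ID counts
    for (day, _), members in ids.items():
        totals[day] += len(members)

    result = defaultdict(dict)
    for (day, attraction), members in ids.items():
        result[day][attraction] = (len(members), len(members) / totals[day])
    return dict(result)


# Made-up sample: a repeated check-in by the same ID is counted once.
sample = [
    ("Fri", 1, 32), ("Fri", 2, 32), ("Fri", 1, 4), ("Fri", 1, 32),
    ("Sun", 3, 4), ("Sun", 4, 4), ("Sun", 5, 4), ("Sun", 3, 32),
]
table = attendance_by_day(sample)
print(table["Sun"])
```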

Space use was, in general terms, different between Saturday and Friday and between Sunday and Friday, while it was similar between Sunday and Saturday (Figure 3). On Saturday and Sunday, the relative frequency of people was higher in most of the north zone of the park than on Friday (Figure 3, upper panels, highlighted in red), while on Friday the relative frequency of people was higher in most of the south zone of the park (Figure 3, upper panels, highlighted in blue). Sunday differed from Saturday mainly in the lower relative frequency of people at attraction 32 (Figure 3, lower panel, highlighted in blue).
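The day-to-day comparison of space use can be sketched as a difference of relative frequencies per grid square. The code below is an illustration only: the square size `cell`, the use of plain `(x, y)` coordinates and the sample points are assumptions, not details taken from this entry.

```python
from collections import Counter


def relative_use(points, cell=5):
    """Share of recorded signals falling in each grid square."""
    counts = Counter((x // cell, y // cell) for x, y in points)
    total = sum(counts.values())
    return {square: n / total for square, n in counts.items()}


def use_difference(points_a, points_b, cell=5):
    """Relative-frequency difference (day A minus day B) per grid square."""
    a = relative_use(points_a, cell)
    b = relative_use(points_b, cell)
    return {sq: a.get(sq, 0.0) - b.get(sq, 0.0) for sq in set(a) | set(b)}


# Made-up positions for two days.
saturday = [(1, 1), (2, 3), (7, 8), (12, 1)]
friday = [(1, 2), (8, 8), (9, 7), (6, 6)]
diff = use_difference(saturday, friday)
for square in sorted(diff):
    print(square, diff[square])
```

Positive values mark squares used relatively more on the first day, negative values squares used relatively more on the second.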

 

Figure 2. Absolute (A) and relative (B) attendance of different IDs at the park attractions on Friday, Saturday and Sunday of the weekend under study.

Figure 3. Comparison of space use between pairs of days, shown as the difference in the relative frequency of detected signals (movements, check-ins) per day and square.

 

Figure 4. Relative frequencies of check-ins per hour by game and day.

 

 

We also noticed a schedule difference between days (Figure 4). On Friday, activities started at 8 AM and ended at 9 PM, while on Saturday and Sunday they ended at 12 AM. It is also worth highlighting that there are two contractions, at 10 AM and 3 PM. This is because no check-in is registered at Creighton Pavilion between 10 AM and 11 AM or between 3 PM and 4 PM on any day. There is also no activity at Grinosaurus Stage between 10 AM and 1 PM or from 4 PM on. Additionally, activity in the park peaked around 4 PM on Friday and Saturday, while on Sunday the peak occurred at 11 AM. After those hours, people seemed to use the attractions less, and check-ins decreased until the park closed. Regarding uncommon behaviors, there is a considerable number of check-ins at the Raptor restroom between 8 AM and 9 AM, which possibly means there is another entrance to the park at this location.
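The hourly profile and the peak hour discussed above can be derived from check-in timestamps alone. The following is a small sketch, assuming timestamps are strings in the form `YYYY-MM-DD HH:MM:SS`; that format, the function names and the sample values are hypothetical.

```python
from collections import Counter
from datetime import datetime


def hourly_profile(timestamps):
    """Relative frequency of check-ins per clock hour."""
    hours = Counter(
        datetime.strptime(t, "%Y-%m-%d %H:%M:%S").hour for t in timestamps
    )
    total = sum(hours.values())
    return {hour: n / total for hour, n in sorted(hours.items())}


def peak_hour(timestamps):
    """Clock hour with the largest share of check-ins."""
    profile = hourly_profile(timestamps)
    return max(profile, key=profile.get)


# Made-up check-in times for a single day.
sample = [
    "2000-01-01 10:15:00",
    "2000-01-01 11:05:00",
    "2000-01-01 11:40:00",
    "2000-01-01 16:20:00",
]
print(hourly_profile(sample))
print(peak_hour(sample))
```

Hours that never appear in the profile, such as the empty slots reported for Creighton Pavilion, show up as missing keys.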

 

 

MC1.3 – What anomalies or unusual patterns do you see? Describe no more than 10 anomalies, and prioritize those unusual patterns that you think are most likely to be relevant to the crime.

 

Regarding space use, the southernmost part of Creighton Pavilion (attraction 32) showed lower relative use than on a normal day, while the northwest part of the park showed higher relative use. The main anomaly was the lower relative use of attraction 32, although lower use, to a lesser degree, was also detected at attractions 4, 5, 7, 63 and 81 (Figure 4A). The last people entered Creighton Pavilion at 11 AM (Figure 4B). We suspect that the crime occurred at that time. Additionally, we assume that when a robbery happens, the thief tries to escape with the stolen item while avoiding any control (check-in). In this sense, 476 IDs were present at the Pavilion at 11 AM. Of them, six IDs (20098, 221553, 542027, 746574, 1264352, 1770473) did not make any other check-in at another attraction of the park. These IDs entered the park together at 08:42:41 and visited the same games at the same time, so we suspect there is a high probability that the crime was committed by this band of thieves. However, they did not leave the park immediately, taking about 1 h 15 min to exit. First, they went from Creighton Pavilion to the exit, then returned to attraction 48, and then went back to the exit (Figure 5). Another detected anomaly was the existence of some IDs whose last record was a check-in, with no other activity detected afterwards. Examples are IDs 1932220, 98371 and 1042280.
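The filter used to narrow the 476 IDs down to a few suspects can be read as "IDs whose final check-in of the day was at attraction 32 during the 11 AM hour". The sketch below implements that reading only; it is an interpretation, and the `(visitor_id, "HH:MM:SS", attraction)` layout, the function name and the sample IDs are made up.

```python
from datetime import datetime


def last_checkin_at(checkins, attraction, hour):
    """IDs whose final check-in of the day was at `attraction` during `hour`."""
    last = {}  # visitor_id -> (time, attraction) of the latest check-in seen
    for visitor_id, stamp, place in checkins:
        when = datetime.strptime(stamp, "%H:%M:%S")
        if visitor_id not in last or when > last[visitor_id][0]:
            last[visitor_id] = (when, place)
    return sorted(
        visitor_id
        for visitor_id, (when, place) in last.items()
        if place == attraction and when.hour == hour
    )


# Made-up sample: ID 2 checks in elsewhere later, so it is excluded.
sample = [
    (1, "09:10:00", 11), (1, "11:02:00", 32),
    (2, "11:03:00", 32), (2, "13:00:00", 64),
    (3, "11:05:00", 32),
]
print(last_checkin_at(sample, attraction=32, hour=11))
```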

Another detected anomaly concerns movement recording. On Friday, the park sensors stopped working at 20:12:07 (Figure 6), when there were still some people in the park (Figure 7). Additionally, valid checkouts were not detected for 90.75% of the people who attended the park (3228 out of 3557). On Saturday, no movement was recorded between 23:23:04 and 23:30:19 (Figure 6). On the other hand, only one ID did not register a checkout (ID 1975667). Its last registered movement was at 22:26:47 (Figure 7). Finally, on Sunday, the last registered movement occurred at 23:25:13, and the following IDs did not register a valid checkout: 898576, 1095309, 1376114, 2063022, 1336607, 1483705, 227221, 392618, 1722376 (Figures 6 and 7).
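Recording gaps like the Saturday one can be found by sorting the movement timestamps and flagging consecutive pairs that are too far apart. This is a minimal sketch: the 5-minute threshold and the sample timestamps around the reported gap are assumptions. The last lines simply recompute the Friday share of missing checkouts from the two counts given above.

```python
from datetime import datetime, timedelta


def recording_gaps(stamps, min_gap=timedelta(minutes=5)):
    """Return (start, end) pairs where no signal was recorded for >= min_gap."""
    times = sorted(datetime.strptime(s, "%H:%M:%S") for s in stamps)
    return [
        (a.strftime("%H:%M:%S"), b.strftime("%H:%M:%S"))
        for a, b in zip(times, times[1:])
        if b - a >= min_gap
    ]


# The 23:23:04 and 23:30:19 stamps are from the text; the others are made up.
sample = ["23:20:10", "23:23:04", "23:30:19", "23:30:40"]
print(recording_gaps(sample))

# Share of Friday visitors without a valid checkout (3228 out of 3557).
missing_share = round(100 * 3228 / 3557, 2)
print(missing_share)  # 90.75
```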

 

Figure 4. A: Comparison of space use between Sunday and a common/normal day, shown as the difference in the relative frequency of detected signals (movements, check-ins) per day and square. B: Relative frequency of check-ins at game 32 per day.

 

 

 

Figure 5. Trajectory of the suspected band of thieves after the robbery.

 

 

Figure 6. Movement detection for time intervals by day.

 

Figure 7. People remaining in the park after the last detected movement. Dot color indicates the time of the last registered movement.

 

The last anomaly we detected concerns irregular movements. On Saturday, ID 1983765 registered movements in different sectors of the park at the same moment in time (Figure 8). This behavior began at 20:18:21, and a checkout was also registered for this ID through the Raptor Restroom entrance twice, at 20:34:36 and 20:36:07.
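An ID that appears in two places at once can be detected by collecting the distinct positions recorded for each ID and timestamp. The sketch below assumes records shaped as `(visitor_id, timestamp, x, y)`; the ID 1983765 and the time 20:18:21 come from the text, while the coordinates and the second ID are invented for illustration.

```python
from collections import defaultdict


def simultaneous_positions(records):
    """Return {(visitor_id, timestamp): positions} for IDs seen in >1 place at once."""
    seen = defaultdict(set)
    for visitor_id, stamp, x, y in records:
        seen[(visitor_id, stamp)].add((x, y))
    return {key: sorted(pos) for key, pos in seen.items() if len(pos) > 1}


# Coordinates below are invented; only the first ID and the time are from the text.
sample = [
    (1983765, "20:18:21", 10, 20),
    (1983765, "20:18:21", 70, 85),
    (555, "20:18:21", 10, 20),
    (555, "20:18:22", 10, 21),
]
print(simultaneous_positions(sample))
```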

 


Figure 8. Positions of the same ID at the same times.